ABSTRACT

Adaptive HTTP streaming with centralized consideration of
multiple streams has gained increasing interest. It poses the
special challenge that the interests of the content provider
and the network operator must be carefully balanced. More
importantly, the adaptation strategy must be flexible
enough to be ported to various systems that work under different
network environments, QoE levels, and economic objectives.
To address these challenges, we propose a Markov
Decision Process (MDP) based network-assisted adaptation
framework, wherein the cost of buffering, significant playback
variation, bandwidth management, and the income of playback
are jointly investigated. We then demonstrate its promising
service provisioning and maximal profit for a mobile network
in which fair or differentiated service is required.

1. INTRODUCTION 

Thanks to its strong scalability and versatility, HTTP adaptive
streaming has established itself as the dominant technique
for Internet video delivery, and is expected to remain a major
component of video delivery in the future Internet. The
most challenging research task for HTTP adaptive streaming
is the rate selection process, that is, deciding which quality
version of a segment should be streamed. Many works [?]
have pursued optimal adaptation conducted
by the client itself using local measurements and estimation.
Although these rate adaptation algorithms show promising
performance in single-user networks, they cannot be directly
applied to multi-client mobile networks with a shared
bandwidth bottleneck. Playback instability and unfairness
have been commonly identified as weaknesses of
client-side adaptation in practical experiments where two or
more clients compete for the same dynamic bottleneck [?].
In fact, the fundamental problem behind these issues is that
client-side adaptation naively assumes the wireless network
is used by the client alone and has no knowledge of other
competing streams in the shared bottleneck.

Therefore, it is natural for the network operator to arbitrate
such competition and globally adapt the clients’ decisions
by considering the status of multiple players and the overall
network environment. MPEG and 3GPP have included such
ideas in their working draft [?] or technical report [?] in order
to further enhance the quality of experience (QoE) of Internet
streaming. In addition to performance improvement,
there are also economic incentives for a network operator to
join the adaptive streaming ecosystem. For instance, content
provider Netflix has recently entered a peering agreement on
smooth streaming with network operator Comcast [?]. One of
the most critical challenges, however, is how to balance the
interests of a content provider and a network operator. Indeed,
a content provider always wants its users to experience satisfactory
QoE, whereas a mobile operator may sometimes be
more concerned about its bandwidth cost under congestion.
Hence, the adaptation strategy should guarantee the
users’ QoE while maximizing the operator’s profit. On
top of this challenge, it is also crucial to design the
joint adaptation framework in a flexible and systematic way
so that the framework can be easily ported to various applications
and network environments. For example, there are
diverse types of service provision, such as fair experience versus
differentiated QoE and high-quality playback versus highly
smooth playback, and different cost models of network operators,
such as bandwidth-centric versus user-number-centric.

Only a few research works have focused on Internet
streaming with competing clients. Some exploited traffic
shaping mechanisms on the server side to orchestrate the experience
of two competing users [?]. Others aimed at optimizing
various kinds of QoE utility, such as rate-resolution utility
[?] and QoE continuum [?], by developing individual optimal
or heuristic algorithms. One unique approach called WiDASH
[?] inserts an additional proxy between the server and the
wireless access networks to implement a split-TCP architecture.
Nonetheless, all these works concentrate on a specific
QoE objective only from the perspective of content providers,
which limits their applicability.

The major contribution of this research is the development
of a network-assisted adaptive HTTP streaming
framework for multiple clients competing for shared bandwidth
over the same bottleneck link. The scheme can flexibly
achieve different system-level service requirements by tuning
certain framework parameters and can maximize the economic
profit of the operator at the same time. Specifically,
we propose a Markov Decision Process (MDP) based framework
to carry out the adaptation decision, where we jointly
consider the clients’ bandwidth and service requirements,
and thoroughly investigate the operator’s economic benefits.
Fig. 1. Architecture of the proposed mobile streaming system.


In the following, we focus on mobile cellular networks to
introduce the framework and we describe the system from
the point of view of mobile operators. Then we present
two case studies where service fairness and service differentiation
(for premium users and regular users) are required
by a content provider within an operator-controlled mobile
network. These have been identified as major use cases of
network-assisted streaming by MPEG [?]. However, it should
be noted that the framework is general and
can be adapted to other service requirements.

2. MDP-BASED NETWORK-ASSISTED 
ADAPTATION FRAMEWORK 

We consider a cellular Internet streaming system as shown in
Fig. 1. We target one cell, where $\mathcal{U}$ is the set of users and
each user is indexed by $i = 1, 2, \cdots, N$. There is a set of
videos $\mathcal{V}$ with different qualities, characterized by bit-rate $r_j$,
where $j = 1, 2, \cdots, M$. Each quality version is split into
$T_{seg}$-second segments. We shift the intelligence from the clients
to a logically centralized controller within the mobile operator’s
network, i.e., the Application Function (AF) within 3GPP
networks [?], for the network-assisted adaptation.

The proposed system works as follows. The video server
initially sends out a description of the video versions. At each
switching point, which occurs every $T_{seg}$ seconds, a user
requests a segment based on local throughput measurements
of very low complexity. Such a request only serves as a
preliminary suggestion, and therefore no corresponding local
optimization is carried out in the client. In other words, there
is no conflict if this request is modified later. Unlike
conventional client-side adaptation, where the mobile operator
simply forwards the client requests to the video server,
the mobile operator here rewrites the request based on the
MDP-based cell-wide adaptation done by the AF. The client
feedback for adaptation, such as available video representations
and bandwidth measurements, can be feasibly communicated
to the AF. For instance, this can be achieved by a standardized
quality metrics reporting process [?] or new techniques
like URL parameter insertion [?]. The final adaptation decision
is then delivered to the cellular scheduler and the
video server. We adopt rate-proportional scheduling, wherein
the downlink resources are allocated based on the upper-layer
video rate. That way, the video version decided from the system’s
perspective can be effectively streamed to the users.

2.1. Wireless Bandwidth Model 

We model the last-hop cellular link for each client as a
Markov process, an approach that has been widely studied and shown
to be effective [?]. Each link has $K$ states of available
bandwidth, namely $\mathcal{B} = \{bw_k \mid k = 1, 2, \cdots, K\}$.

The clients measure their own downlink bandwidth and
periodically report it to the AF, where it serves as the input to the
adaptation framework. Note that the measured bandwidth
may not exactly equal any value $bw_k$. We therefore divide the
available bandwidth domain into $K$ regions and map the measured
bandwidth to the corresponding region.
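The region mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bandwidth_region` is hypothetical, the boundaries are the 256, 512, and 896 Kbps values quoted in Section 3, and the treatment of a measurement that falls exactly on a boundary (assigned to the upper region) is our choice, not something the framework specifies.

```python
from bisect import bisect_right

# Region boundaries in Kbps, reused from the evaluation setup in Section 3.
BOUNDARIES_KBPS = [256, 512, 896]


def bandwidth_region(measured_kbps, boundaries=BOUNDARIES_KBPS):
    """Return the 1-based region index k for a measured bandwidth.

    Assumption: a measurement exactly on a boundary belongs to the
    upper region.
    """
    if measured_kbps < 0:
        raise ValueError("bandwidth must be non-negative")
    # bisect_right counts how many boundaries are <= the measurement.
    return bisect_right(boundaries, measured_kbps) + 1


if __name__ == "__main__":
    for sample in (100.0, 300.0, 600.0, 1200.0):
        print(sample, "Kbps -> region", bandwidth_region(sample))
```

With three boundaries the function returns a region index between 1 and 4, matching a four-state channel.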


2.2. Mobile Operator Profit Model 

Normal Playback Income. At switching point $t$, the mobile
operator is expected to obtain income $I_{play}$ from the content
provider if it can guarantee normal playback at rate $R_{t+1}$
during the upcoming period. Formally, this implies
$R_{t+1} \le BW_{t+1}$, where $R_{t+1} \in \mathcal{V}$ and
$BW_{t+1} \in \mathcal{B}$ is the estimated
bandwidth of the link. Considering the logarithmic relationship
in rate-distortion theory, we model the income from client $i$ as

$$I_{i,play} = T_{play}\,\alpha\,\frac{1}{\mu_{play}} \log\frac{R_{i,t+1}}{R_{min}} \tag{1}$$

where $\alpha$ ($0 < \alpha < 1$) is the sharing weight that normal
playback places on QoE and $R_{min} = \min\{r_j \mid r_j \in \mathcal{V}\}$. We
denote the normalization factor as $\mu_{play} = \log\frac{R_{max}}{R_{min}}$, where
$R_{max} = \max\{r_j \mid r_j \in \mathcal{V}\}$. $T_{play}$ is an indicator function for
normal playback, namely $T_{play} = 1$ if $R_{t+1} \le BW_{t+1}$
and $T_{play} = 0$ otherwise.

Playback Buffering Cost. If the operator-adapted bit-rate
exceeds the available bandwidth, playback buffering would
occur, which should be discouraged in the service contract.
Hence, we impose a penalty cost $C_{i,buf}$ on player $i$:

$$C_{i,buf} = (1 - T_{play})\,\beta\,\frac{1}{\mu_{buf}} \log\frac{R_{i,t+1} - BW_{t+1}}{b_{min}} \tag{2}$$

where $\beta$ ($0 < \beta < 1$) is the re-buffering weight, and $b_{min} =
\min\{r_j - bw_k \mid r_j \in \mathcal{V}, bw_k \in \mathcal{B}\}$. The normalization factor
of playback buffering is $\mu_{buf} = \log\frac{R_{max} - BW_{min}}{b_{min}}$, where
$BW_{min} = \min\{bw_k \mid bw_k \in \mathcal{B}\}$.

Playback Smoothness. If the rate variation between two consecutive
periods is too large, the content provider
charges the mobile operator a penalty $C_{i,var}$.
The penalty is scaled by the smoothness weight $\gamma$ ($0 < \gamma < 1$), with
$\alpha + \beta + \gamma = 1$; $\Delta$ is the rate variation threshold below which no
penalty occurs, and $\mu_{var}$ is the corresponding logarithmic
normalization factor. The indicator function
is $T_{var} = 1$ if $R_t - R_{t+1} > \Delta$ and $T_{var} = 0$ otherwise.

Bottleneck Bandwidth Cost. Finally, the mobile cellular
network may experience radio congestion, during which
the mobile operator has to spend proportionally more
on bandwidth to manage the network:

$$C_{bw} = T_{bw}\,\theta \Big( \sum_{i \in \mathcal{U}} R_{i,t+1} - R_{th} \Big) \tag{4}$$

where $R_{th}$ is the total service rate constraint above which the
mobile operator has to spend extra money, $\theta$ is the cost per
unit of exceeded rate, $T_{bw} = 1$ when $\sum_{i \in \mathcal{U}} R_{i,t+1} > R_{th}$, and
$T_{bw} = 0$ otherwise.
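As a rough illustration of how two of the profit terms could be evaluated, the Python sketch below computes a log-normalized playback income and the congestion cost of Eq. (4). The function names are hypothetical. The income formula is an assumed reading of Eq. (1): the weight times log(R / R_min), normalized by log(R_max / R_min), and zero when the rate exceeds the estimated bandwidth. The rate ladder is borrowed from Section 3.

```python
import math

# Bit-rates (Kbps) of the five quality levels used in Section 3.
RATES_KBPS = [95.11, 183.53, 364.63, 493.02, 798.09]


def playback_income(rate, est_bw, alpha, rates=RATES_KBPS):
    """Assumed form of Eq. (1) for one client.

    Returns alpha * log(rate / R_min) / log(R_max / R_min) when the rate
    fits in the estimated bandwidth, and 0 otherwise.
    """
    if rate > est_bw:
        return 0.0  # indicator T_play = 0: normal playback not guaranteed
    r_min, r_max = min(rates), max(rates)
    return alpha * math.log(rate / r_min) / math.log(r_max / r_min)


def bandwidth_cost(selected_rates, r_th, theta):
    """Eq. (4): theta times the aggregate rate in excess of r_th.

    The excess is tested first so that an infinite theta with no excess
    yields 0 rather than the NaN produced by inf * 0.
    """
    excess = sum(selected_rates) - r_th
    return theta * excess if excess > 0 else 0.0


if __name__ == "__main__":
    print(playback_income(364.63, 512.0, alpha=0.3))
    print(bandwidth_cost([493.02, 364.63], r_th=850.0, theta=2.0))
    print(bandwidth_cost([493.02, 493.02], r_th=850.0, theta=math.inf))
```

With an infinite `theta`, any action whose aggregate rate exceeds `r_th` receives an infinite cost, which is one way to express a hard cap on the total service rate.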

2.3. Markov Decision Process Formulation 

An MDP is a 4-tuple reinforcement learning task [?] that can
intelligently interact with the uncertain bandwidth estimation
and make the adaptation decision accordingly.

System States. The system state at time period $t$ is
defined as $s_t = (\mathbf{R}_t, \mathbf{BW}_t)$, where the vectors represent the
quality version and bandwidth of all clients in $\mathcal{U}$. In addition,
the system state set $\mathcal{S} = \{s_1, s_2, \cdots, s_T\}$ keeps
track of the evolution of the entire streaming system, where
$T$ is the session duration. Since the system state depends only on
its most recent (previous) state, the Markov property holds.

System Actions. The action set $\mathcal{A} = \{a_1, a_2, \cdots, a_T\}$
defines the operator-selected quality version of video for all
the users. In particular, $a_t = \mathbf{R}_{t+1}$ denotes the quality
version to be streamed in the upcoming adaptation period.

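State Transitions. The state transition from $s_t$ to $s_{t+1}$ is
determined by the decided quality version and the available
bandwidth at time period $t$. Given that the mobility-driven
bandwidth variation is independent of bit-rate decisions, the state
transition probability $\mathcal{P}_{a_t}(s_t, s_{t+1})$ can be derived as

$$\begin{aligned}
\mathcal{P}_{a_t}(s_t, s_{t+1}) &= \Pr\{s_{t+1} \mid s_t\} \\
&= \Pr\{(\mathbf{R}_{t+1}, \mathbf{BW}_{t+1}) \mid (\mathbf{R}_t, \mathbf{BW}_t)\} \\
&= \Pr\{\mathbf{R}_{t+1} \mid \mathbf{R}_t, \mathbf{R}_{t+1} = a_t\}\,\Pr\{\mathbf{BW}_{t+1} \mid \mathbf{BW}_t\}
\end{aligned} \tag{5}$$

where $\Pr\{\mathbf{BW}_{t+1} \mid \mathbf{BW}_t\}$ can be obtained from the transition
matrix of the channel model and $\Pr\{\mathbf{R}_{t+1} \mid \mathbf{R}_t, \mathbf{R}_{t+1} = a_t\}$
is decided by the action policy.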
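The factorization in Eq. (5) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `transition_prob` and the tuple-based state encoding are hypothetical, the channel matrix is the one quoted in Section 3, and we assume that the users' channels evolve independently of one another so that the joint bandwidth term is a product over users.

```python
from itertools import product

# Four-state channel transition matrix quoted in Section 3.
P_BW = [
    [0.5, 0.5, 0.0, 0.0],
    [0.2, 0.6, 0.2, 0.0],
    [0.0, 0.1, 0.7, 0.2],
    [0.0, 0.0, 0.2, 0.8],
]


def transition_prob(state, action, next_state, p_bw=P_BW):
    """P_a(s, s') for s = (rates, bw_states), one tuple entry per user.

    Rates are quality indices; bw_states are 0-based channel-state indices.
    """
    _, bw = state
    next_rates, next_bw = next_state
    # Rate component: the next rate vector is fixed by the chosen action.
    if tuple(next_rates) != tuple(action):
        return 0.0
    prob = 1.0
    # Assumption: each user's channel evolves independently.
    for k, k_next in zip(bw, next_bw):
        prob *= p_bw[k][k_next]
    return prob


if __name__ == "__main__":
    s = ((0, 0), (1, 2))  # two users, channel states 1 and 2
    a = (1, 0)            # operator-selected quality indices
    total = sum(
        transition_prob(s, a, (r, b))
        for r in product(range(2), repeat=2)
        for b in product(range(4), repeat=2)
    )
    print("probabilities over all next states sum to", total)
```

Because the rate component is deterministic given the action, only the bandwidth part contributes randomness, and the probabilities over all next states sum to one.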

Profit Function and Optimization Objective. The profit
function $\mathcal{R}_{a_t}(s_t, s_{t+1})$ is the overall monetary profit of the
mobile operator from $s_t$ to $s_{t+1}$, i.e.,

$$\mathcal{R}_{a_t}(s_t, s_{t+1}) = \sum_{i \in \mathcal{U}} \lambda_i \left( I_{i,play} - C_{i,buf} - C_{i,var} \right) - C_{bw} \tag{6}$$

where $\lambda_i$ is the service priority coefficient for user $i$ and
$\sum_{i \in \mathcal{U}} \lambda_i = 1$. Hence, the optimization objective of
network-assisted streaming is to find the optimal
policy $\pi$ such that the profit of the mobile operator during the
$T$-second session is maximized. Thus the objective is
$OBJ = \max_{\pi} \sum_{t=0}^{T-1} \mathcal{R}_{a_t}(s_t, s_{t+1})$.

Framework Flexibility. All the above models can be customized
for the mobile operator or content providers. Importantly,
these modified frameworks can still be optimally
solved as introduced next, because they only lead to a different
profit function $\mathcal{R}_{a_t}$.

2.4. MDP Solution 

We propose a dynamic programming based algorithm to solve
the profit-maximized rate adaptation problem. We first let
$v(s_t)$ be the maximal expected profit from $s_t$ to the end state
$s_T$. Based on Bellman value iteration [?], we have

$$v(s_t) = \max_{a_t \in \mathcal{A}} \Big\{ \sum_{s_{t+1} \in \mathcal{S}} \mathcal{P}_{a_t}(s_t, s_{t+1}) \big( \mathcal{R}_{a_t}(s_t, s_{t+1}) + v(s_{t+1}) \big) \Big\} \tag{7}$$

where $\mathcal{S} = \mathcal{V}^N \times \mathcal{B}^N$ is the state space. This essentially
means that, for a given current state $s_t \in \mathcal{S}$ and a given action $a_t$,
there are multiple possible next states $s_{t+1} \in \mathcal{S}$, and we need to
compute the expected reward over all these transitions. We then
obtain the optimal reward by selecting the action that achieves
the highest expected reward. Setting $v(s_T) = 0$
gives the iteration an initial value. Iterating backward, we can
compute the optimal $v(s_0)$ (equal to the objective $OBJ$) for all
possible states $s_0 \in \mathcal{S}$ and, accordingly, the optimal policy

$$\pi = \arg\max_{a \in \mathcal{A}} \Big\{ \sum_{s_1 \in \mathcal{S}} \mathcal{P}_a(s_0, s_1) \big( \mathcal{R}_a(s_0, s_1) + v(s_1) \big) \Big\} \tag{8}$$

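Algorithm 1 Dynamic Programming Adaptation Algorithm
1: Initialization: $t \leftarrow T - 1$, $v(s_{t+1}) \leftarrow 0$
2: while $t \ge 0$ do
3:     for all possible $s_t \in \mathcal{S}$ do
4:         $v(s_t) \leftarrow \max_{a_t \in \mathcal{A}} \{ \sum_{s_{t+1} \in \mathcal{S}} \mathcal{P}_{a_t}(s_t, s_{t+1}) ( \mathcal{R}_{a_t}(s_t, s_{t+1}) + v(s_{t+1}) ) \}$
5:     $v(s_{t+1}) \leftarrow v(s_t)$
6:     $t \leftarrow t - 1$
7: Output the optimal deterministic policy $\pi$.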

A table that maps each state to the optimal quality version
is then generated, and the mobile operator can make
the decision by looking up the table at each switching point.

We summarize the iterative algorithm in Algorithm 1.
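The backward recursion can be sketched in Python as below. This is a generic finite-horizon dynamic programming routine under stated assumptions, not the authors' code: `backward_induction` and the toy model are hypothetical names and numbers, and, unlike Algorithm 1, which overwrites a single value table, this version keeps one table per period so that the resulting lookup table is explicit.

```python
def backward_induction(states, actions, horizon, trans_prob, reward):
    """Finite-horizon dynamic programming over a known MDP.

    Returns (value, policy): value[t][s] is the maximal expected profit
    from period t onward and policy[t][s] is an action achieving it.
    """
    value = [dict() for _ in range(horizon + 1)]
    policy = [dict() for _ in range(horizon)]
    for s in states:
        value[horizon][s] = 0.0  # terminal condition v(s_T) = 0
    for t in range(horizon - 1, -1, -1):
        for s in states:
            best_action, best_value = None, float("-inf")
            for a in actions:
                expected = 0.0
                for s2 in states:
                    p = trans_prob(s, a, s2)
                    if p > 0.0:  # skip impossible transitions (avoids 0 * inf)
                        expected += p * (reward(s, a, s2) + value[t + 1][s2])
                if expected > best_value:
                    best_action, best_value = a, expected
            value[t][s] = best_value
            policy[t][s] = best_action
    return value, policy


# Toy model with made-up numbers: one user, two channel states, two rates.
TOY_P = [[0.8, 0.2], [0.3, 0.7]]
TOY_BW_KBPS = [200, 500]
TOY_RATE_KBPS = [100, 400]


def toy_trans_prob(s, a, s2):
    # The channel evolves independently of the selected rate.
    return TOY_P[s][s2]


def toy_reward(s, a, s2):
    if a == 0:
        return 0.2  # low rate always plays, small income
    # High rate pays off only if the next channel state can carry it.
    return 1.0 if TOY_RATE_KBPS[a] <= TOY_BW_KBPS[s2] else -1.0


if __name__ == "__main__":
    value, policy = backward_induction(
        [0, 1], [0, 1], 2, toy_trans_prob, toy_reward
    )
    print("policy at t=0:", policy[0])
    print("value at t=0:", value[0])
```

In the toy model the low rate is chosen in the bad channel state and the high rate in the good one, which is the kind of state-to-action table the operator would consult at each switching point.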

3. PERFORMANCE EVALUATIONS 

We evaluate the proposed system using two case studies, i.e.,
fair services and differentiated services, with two competing
users in ns-2. We use the H.264/AVC encoded sequence
“Big Buck Bunny”. The bit-rates of the 5 quality levels are 95.11,
183.53, 364.63, 493.02, and 798.09 Kbps, respectively.

We use a four-state Markov last-hop channel. Each state
lasts for 1 second. We also select the segment length $T_{seg}$ to be
1 second to correspond to the channel coherence time. Since
we are interested in the service provisioning by the mobile
operator, we adopt the same Markov model for all users to
eliminate the impact of channel differentiation. We adopt
the Markov channel matrix used in [?] to simulate the cellular
link, i.e., $P_t$ = [.5, .5, 0, 0; .2, .6, .2, 0; 0, .1, .7, .2; 0, 0, .2, .8].
The bandwidth region boundaries for the four states are 256,
512, and 896 Kbps. We use the core network settings in [?].
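To get a feel for this channel, the sketch below samples a state path from the matrix and estimates its long-run state occupancy by power iteration. It is an illustration only: the function names and the random seed are hypothetical, and the evaluation itself was run in ns-2, not with this code.

```python
import random

# Channel transition matrix from the text; row i gives the distribution
# of the next state when the current state is i (0-based).
P_T = [
    [0.5, 0.5, 0.0, 0.0],
    [0.2, 0.6, 0.2, 0.0],
    [0.0, 0.1, 0.7, 0.2],
    [0.0, 0.0, 0.2, 0.8],
]


def simulate_channel(p, start, steps, rng):
    """Return a list of steps + 1 state indices, one per 1-second slot."""
    path = [start]
    for _ in range(steps):
        nxt = rng.choices(range(len(p)), weights=p[path[-1]])[0]
        path.append(nxt)
    return path


def stationary_distribution(p, iterations=1000):
    """Approximate the stationary distribution by repeated multiplication."""
    n = len(p)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * p[i][j] for i in range(n)) for j in range(n)]
    return dist


if __name__ == "__main__":
    rng = random.Random(0)  # arbitrary seed for repeatability
    print(simulate_channel(P_T, start=0, steps=20, rng=rng))
    print([round(x, 4) for x in stationary_distribution(P_T)])
```

Solving the balance equations by hand gives a stationary distribution of (2/27, 5/27, 10/27, 10/27), so under this matrix the link spends roughly three quarters of the time in the two highest bandwidth states.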

Regarding the profit model, we assume the overall service
rate constraint $R_{th}$ is 850 Kbps. The congestion penalty
$\theta$ is set to $\infty$, which indicates that aggressive bit-rate
selection is completely forbidden. We set the weights $\alpha$, $\beta$,
and $\gamma$ to 0.3, 0.5, and 0.2, respectively, to place higher emphasis
on buffering. We have run simulations with different sets of
weights and found service provisioning to be insensitive to them.
However, the actual profit of the operator varies, which
indicates the mobile operator’s potential tradeoff between profit
and service agreement. The threshold for rate consistency $\Delta$
is set to 350 Kbps. The service priority coefficients $\lambda$ are
set to 0.5/0.5 and 0.7/0.3 for the two users in the fair and
differentiated services cases, respectively. The initial quality
is the lowest version and the player’s buffer is set to 80
frames. Each video session runs for 200 seconds and is repeated 15 times.

Fig. 2. The session profit of the mobile operator versus a) service
rate constraint $R_{th}$ during congestion; b) session duration $T$.

Fig. 3. Playback bit-rate versus time of a) the proposed system;
b) conventional client-centric system.

3.1. Fair Services Provisioning 

In this section, we evaluate the performance of different adaptation
algorithms when applied in the network-assisted system,
where fair service provisioning is required:

i) Upper bound algorithm (referred to as Ideal): Same as Algorithm 1
except that the future channel condition is perfectly
known by the mobile operator, such that the MDP transition
probability becomes 1.0 when considering an action. Thus
we can compute the actual rather than the expected rewards.

ii) Myopic algorithm (referred to as Myopic): The network-assisted
system directly employs the local throughput-based
adaptation decision from clients.
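The Myopic baseline can be illustrated with the short sketch below. The exact client rule is not spelled out in the text, so this assumes the common choice of picking the highest bit-rate that does not exceed the locally measured throughput; the function name `myopic_rate` is hypothetical and the rate ladder and 850 Kbps constraint are taken from the evaluation setup.

```python
# Bit-rates (Kbps) of the five quality levels used in the evaluation.
RATES_KBPS = [95.11, 183.53, 364.63, 493.02, 798.09]


def myopic_rate(throughput_kbps, rates=RATES_KBPS):
    """Assumed client rule: highest rate not above the measured throughput.

    Falls back to the lowest rate when even that one does not fit.
    """
    feasible = [r for r in rates if r <= throughput_kbps]
    return max(feasible) if feasible else min(rates)


if __name__ == "__main__":
    r_th = 850.0  # overall service rate constraint (Kbps)
    # Each client decides on its own measurement, unaware of the other.
    choices = [myopic_rate(500.0), myopic_rate(500.0)]
    print(choices, "aggregate exceeds constraint:", sum(choices) > r_th)
```

In this example both clients pick 493.02 Kbps, so the aggregate rate exceeds the 850 Kbps constraint even though each individual decision looks safe locally.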

We show the total profit of the mobile operator versus the service
rate constraint in Fig. 2a. We also show the profit versus
session duration in Fig. 2b. As shown in the figures, the
proposed algorithm achieves performance similar to Ideal.
This demonstrates the insensitivity of the proposed
algorithm to channel model accuracy, which broadens its
applicability. Additionally, the proposed algorithm shows
superior performance over Myopic because Myopic, without
considering the profit factors, may select a high bit-rate that
causes client re-buffering or overly large rate variations.

3.2. Differentiated Services Provisioning 

We now evaluate the performance of the proposed system under
the requirement of service differentiation. Since we obtained
a profit comparison curve similar to Fig. 2 for the differentiated
services case, we focus on evaluating other playback
metrics, compared with the conventional client-centric system.

Fig. 3 shows the playback bit-rate in one session. The high-priority
user (UE1) achieves a significantly higher rate in the
network-assisted system. This is because the mobile operator
accounts for the differentiated service by tuning the priority
coefficient, thereby making service-aware adaptation. In contrast,
users enjoy a similar playback rate in the client-centric
system since they essentially compete for the shared bandwidth.
We also found that significant rate variations (>350 Kbps) in
the proposed system take more than 5 segments, and
no variation within one segment exceeds the threshold.

Fig. 4. Buffer occupancy versus time of a) the proposed system;
b) conventional client-centric system.

Table 1. Playback Quality Metrics

UE1/UE2          Average bit-rate (Kbps)   Buffering ratio     Occurrence of buffering (per second)
Proposed         303.95 / 150.18           0 / 17.71%          0 / 4.43
Client-centric   260.64 / 266.06           18.69% / 18.79%     4.67 / 4.70

For buffer occupancy, as shown in Fig. 4, we observe that
the high-priority user never suffers buffer underflow in the
network-assisted system even when congestion occurs. This
comes at the expense of performance degradation for the low-priority
user, which is acceptable under the service differentiation
agreement. In the client-centric system, by contrast, the competing
players are frequently re-buffered.

We also show industry-standard performance metrics [?]
in Table 1. We observe that the proposed system generally
outperforms the client-centric system. It can also satisfactorily
meet the service differentiation requirement.

4. CONCLUSION 

We proposed a generalized MDP-based adaptation framework
for network-assisted mobile streaming by considering
the playback, bandwidth, and economic factors from the mobile
operator’s view. With two case studies, the proposed
framework is shown to outperform conventional client-side
adaptation in terms of service provisioning. It also achieves
near-ideal profit for the operator. Despite the potential complexity
of the MDP, the burden can be significantly relieved when
it is implemented in the operator’s proxy clouds. Future work can
focus on extending the framework to a larger number of users.